Overview

Dataset statistics

Number of variables: 19
Number of observations: 200000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 29.0 MiB
Average record size in memory: 152.0 B
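
The memory figures above can be reproduced with a back-of-the-envelope check. The sketch below assumes that every cell is stored as an 8-byte value (for example int64, float64, or an object pointer); that width is inferred from the numbers and is not stated by the report.

```python
# Figures taken from the dataset statistics above.
n_rows = 200_000
n_cols = 19

# Assumption: each cell occupies 8 bytes (int64 / float64 / object pointer).
bytes_per_value = 8

record_bytes = n_cols * bytes_per_value          # bytes per row
total_mib = n_rows * record_bytes / 2**20        # whole table, in MiB
column_mib = n_rows * bytes_per_value / 2**20    # a single column, in MiB

print(f"record size:     {record_bytes} B")
print(f"total size:      {total_mib:.1f} MiB")
print(f"per-column size: {column_mib:.1f} MiB")
```

Under that assumption the three values agree with the reported 152.0 B per record, 29.0 MiB in total, and the 1.5 MiB listed for each variable.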

Variable types

Numeric: 10
Categorical: 9

Warnings

price_vnd is highly skewed (γ1 = 235.7020306)
sim_number has unique values
same_of_a_kind has 182027 (91.0%) zeros
straight has 189488 (94.7%) zeros
same_of_a_kind_middle has 170845 (85.4%) zeros
straight_middle has 188280 (94.1%) zeros
taxi has 174881 (87.4%) zeros
reserve has 197897 (98.9%) zeros
reserve_middle has 191713 (95.9%) zeros
last_number has 11894 (5.9%) zeros
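
The zero percentages in these warnings follow directly from the zero counts and the 200000 observations. A minimal sketch that recomputes them, with the counts copied from the warnings above:

```python
# Zero counts copied from the warnings; n_obs from the dataset statistics.
n_obs = 200_000
zero_counts = {
    "same_of_a_kind": 182027,
    "straight": 189488,
    "same_of_a_kind_middle": 170845,
    "straight_middle": 188280,
    "taxi": 174881,
    "reserve": 197897,
    "reserve_middle": 191713,
    "last_number": 11894,
}

# Share of zeros per variable, in percent.
zero_share = {name: 100 * count / n_obs for name, count in zero_counts.items()}

for name, share in zero_share.items():
    print(f"{name}: {share:.1f}% zeros")
```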

Reproduction

Analysis started: 2022-12-10 21:38:32.599290
Analysis finished: 2022-12-10 21:39:06.449285
Duration: 33.85 seconds
Software version: pandas-profiling v2.12.0
Download configuration: config.yaml
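
The reported duration is simply the difference between the two timestamps. A small check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2022-12-10 21:38:32.599290")
finished = datetime.fromisoformat("2022-12-10 21:39:06.449285")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```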

Variables

price_vnd
Real number (ℝ≥0)

SKEWED

Distinct: 939
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13950269.82
Minimum: 99000
Maximum: 1.68 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 99000
5-th percentile: 450000
Q1: 500000
median: 1000000
Q3: 5000000
95-th percentile: 15000000
Maximum: 1.68 × 10^11
Range: 1.67999901 × 10^11
Interquartile range (IQR): 4500000

Descriptive statistics

Standard deviation: 591299676.8
Coefficient of variation (CV): 42.38625379
Kurtosis: 63791.3299
Mean: 13950269.82
Median Absolute Deviation (MAD): 550000
Skewness: 235.7020306
Sum: 2.790053963 × 10^12
Variance: 3.496353078 × 10^17
Monotonicity: Not monotonic
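
Several of the price_vnd statistics are derived from one another, which gives a quick way to sanity-check the report. The sketch below recomputes the coefficient of variation, variance, sum, IQR, and range from the other reported values; it assumes the displayed maximum of 1.68 × 10^11 is exact, not rounded.

```python
import math

# Values copied from the price_vnd statistics above.
n = 200_000
mean = 13950269.82
std = 591299676.8
minimum, maximum = 99_000, 1.68e11   # assumption: maximum is exactly 1.68e11
q1, q3 = 500_000, 5_000_000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all prices
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.6e}")
print(f"Sum      = {total:.6e}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range:.8e}")
```

Each recomputed value agrees with the reported figure to within display rounding. The mean (about 13.95 million) sits far above the median (1 million), which is the same right skew that the warning flags.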
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
450000              45119   22.6%
1000000             38472   19.2%
500000              30619   15.3%
3000000             29199   14.6%
5000000             21383   10.7%
10000000            6139    3.1%
12000000            5124    2.6%
11325000            2682    1.3%
11000000            1760    0.9%
13000000            1720    0.9%
Other values (929)  17783   8.9%

Value   Count  Frequency (%)
99000   58     < 0.1%
119000  32     < 0.1%
199000  18     < 0.1%
220000  1      < 0.1%
250000  47     < 0.1%

Value         Count  Frequency (%)
1.68 × 10^11  1      < 0.1%
1.65 × 10^11  1      < 0.1%
5.5 × 10^10   1      < 0.1%
4.9 × 10^10   1      < 0.1%
4.5 × 10^10   1      < 0.1%

sim_number
Real number (ℝ≥0)

UNIQUE

Distinct: 200000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 730479740.9
Minimum: 325009779
Maximum: 997979696
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 325009779
5-th percentile: 329600084.4
Q1: 392260526
median: 833428672.5
Q3: 918087387.5
95-th percentile: 977250001
Maximum: 997979696
Range: 672969917
Interquartile range (IQR): 525826861.5

Descriptive statistics

Standard deviation: 240647935.3
Coefficient of variation (CV): 0.3294382059
Kurtosis: -1.070182089
Mean: 730479740.9
Median Absolute Deviation (MAD): 103409268.5
Skewness: -0.7831819706
Sum: 1.460959482 × 10^14
Variance: 5.791142876 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
343189288              1       < 0.1%
382056969              1       < 0.1%
817212227              1       < 0.1%
877150979              1       < 0.1%
775775531              1       < 0.1%
819857999              1       < 0.1%
989283414              1       < 0.1%
978111043              1       < 0.1%
967937992              1       < 0.1%
946237833              1       < 0.1%
Other values (199990)  199990  > 99.9%

Value      Count  Frequency (%)
325009779  1      < 0.1%
325011980  1      < 0.1%
325011981  1      < 0.1%
325012006  1      < 0.1%
325012010  1      < 0.1%

Value      Count  Frequency (%)
997979696  1      < 0.1%
997979191  1      < 0.1%
997968999  1      < 0.1%
997968968  1      < 0.1%
997950999  1      < 0.1%

provider
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value         Count
viettel       88584
mobifone      51854
vinaphone     40694
vietnamobile  11597
itelecom      6899

Length

Max length: 12
Median length: 8
Mean length: 7.99063
Min length: 7

Characters and Unicode

Total characters: 1598126
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
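
The length statistics for provider can be recomputed from the category counts alone. The sketch below uses the six provider counts reported for this variable (including gmobile, 372) and derives the total character count and the min, median, mean, and max length:

```python
import statistics

# Category counts as reported for the provider variable.
provider_counts = {
    "viettel": 88584,
    "mobifone": 51854,
    "vinaphone": 40694,
    "vietnamobile": 11597,
    "itelecom": 6899,
    "gmobile": 372,
}

n = sum(provider_counts.values())
total_chars = sum(len(name) * count for name, count in provider_counts.items())
mean_length = total_chars / n

# Expand to one length per row to take the median (200000 small ints).
lengths = [len(name) for name, count in provider_counts.items() for _ in range(count)]
median_length = statistics.median(lengths)
min_length, max_length = min(lengths), max(lengths)

print(n, total_chars, mean_length, median_length, min_length, max_length)
```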

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: viettel
2nd row: vinaphone
3rd row: vietnamobile
4th row: mobifone
5th row: viettel

Value         Count  Frequency (%)
viettel       88584  44.3%
mobifone      51854  25.9%
vinaphone     40694  20.3%
vietnamobile  11597  5.8%
itelecom      6899   3.4%
gmobile       372    0.2%

Histogram of lengths of the category

Most occurring characters

Value             Count   Frequency (%)
e                 307080  19.2%
i                 211597  13.2%
t                 195664  12.2%
o                 163270  10.2%
n                 144839  9.1%
v                 140875  8.8%
l                 107452  6.7%
m                 70722   4.4%
b                 63823   4.0%
a                 52291   3.3%
Other values (5)  140513  8.8%
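
The character frequencies above can be derived from the provider counts, since every row contributes the characters of its provider name. A minimal sketch using `collections.Counter`, with the counts copied from the provider frequency table:

```python
from collections import Counter

# Category counts as reported for the provider variable.
provider_counts = {
    "viettel": 88584,
    "mobifone": 51854,
    "vinaphone": 40694,
    "vietnamobile": 11597,
    "itelecom": 6899,
    "gmobile": 372,
}

# Weight each character of a provider name by the number of rows with that name.
char_counts = Counter()
for name, count in provider_counts.items():
    for ch in name:
        char_counts[ch] += count

top = char_counts.most_common(10)                          # ten most frequent characters
other = sum(char_counts.values()) - sum(c for _, c in top) # everything outside the top ten

for ch, count in top:
    print(ch, count)
print("other", other)
```

The five characters grouped under "Other values" are the ones that appear in only one or two provider names (f, p, h, c, g).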

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  1598126  100.0%

Most frequent character per category

Value             Count   Frequency (%)
e                 307080  19.2%
i                 211597  13.2%
t                 195664  12.2%
o                 163270  10.2%
n                 144839  9.1%
v                 140875  8.8%
l                 107452  6.7%
m                 70722   4.4%
b                 63823   4.0%
a                 52291   3.3%
Other values (5)  140513  8.8%

Most occurring scripts

Value  Count    Frequency (%)
Latin  1598126  100.0%

Most frequent character per script

Value             Count   Frequency (%)
e                 307080  19.2%
i                 211597  13.2%
t                 195664  12.2%
o                 163270  10.2%
n                 144839  9.1%
v                 140875  8.8%
l                 107452  6.7%
m                 70722   4.4%
b                 63823   4.0%
a                 52291   3.3%
Other values (5)  140513  8.8%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1598126  100.0%

Most frequent character per block

Value             Count   Frequency (%)
e                 307080  19.2%
i                 211597  13.2%
t                 195664  12.2%
o                 163270  10.2%
n                 144839  9.1%
v                 140875  8.8%
l                 107452  6.7%
m                 70722   4.4%
b                 63823   4.0%
a                 52291   3.3%
Other values (5)  140513  8.8%

same_of_a_kind
Real number (ℝ≥0)

ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.29203
Minimum: 0
Maximum: 9
Zeros: 182027
Zeros (%): 91.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9424504869
Coefficient of variation (CV): 3.227238595
Kurtosis: 7.923821423
Mean: 0.29203
Median Absolute Deviation (MAD): 0
Skewness: 3.047644246
Sum: 58406
Variance: 0.8882129202
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value  Count   Frequency (%)
0      182027  91.0%
3      14103   7.1%
4      3366    1.7%
5      419     0.2%
6      64      < 0.1%
7      15      < 0.1%
8      5       < 0.1%
9      1       < 0.1%

Value  Count   Frequency (%)
0      182027  91.0%
3      14103   7.1%
4      3366    1.7%
5      419     0.2%
6      64      < 0.1%

Value  Count  Frequency (%)
9      1      < 0.1%
8      5      < 0.1%
7      15     < 0.1%
6      64     < 0.1%
5      419    0.2%
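
Because same_of_a_kind takes only eight distinct values, its sum, mean, variance, and standard deviation can be recomputed from the value/count pairs listed above. The sketch below does this; the variance uses the n - 1 denominator (sample variance), which is the convention that reproduces the reported figure.

```python
import math

# Value -> count pairs for same_of_a_kind, copied from the frequency table.
freq = {0: 182027, 3: 14103, 4: 3366, 5: 419, 6: 64, 7: 15, 8: 5, 9: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
squared_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, mean, variance, std)
```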

straight
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16532
Minimum: 0
Maximum: 7
Zeros: 189488
Zeros (%): 94.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7085914254
Coefficient of variation (CV): 4.286180894
Kurtosis: 16.21791883
Mean: 0.16532
Median Absolute Deviation (MAD): 0
Skewness: 4.173876487
Sum: 33064
Variance: 0.5021018081
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count   Frequency (%)
0      189488  94.7%
3      9222    4.6%
4      1098    0.5%
5      153     0.1%
6      32      < 0.1%
7      7       < 0.1%

Value  Count   Frequency (%)
0      189488  94.7%
3      9222    4.6%
4      1098    0.5%
5      153     0.1%
6      32      < 0.1%

Value  Count  Frequency (%)
7      7      < 0.1%
6      32     < 0.1%
5      153    0.1%
4      1098   0.5%
3      9222   4.6%

same_of_a_kind_middle
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.48057
Minimum: 0
Maximum: 8
Zeros: 170845
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.186439848
Coefficient of variation (CV): 2.468817962
Kurtosis: 3.453718664
Mean: 0.48057
Median Absolute Deviation (MAD): 0
Skewness: 2.211302446
Sum: 96114
Variance: 1.407639513
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count   Frequency (%)
0      170845  85.4%
3      22574   11.3%
4      4799    2.4%
5      1530    0.8%
6      219     0.1%
7      32      < 0.1%
8      1       < 0.1%

Value  Count   Frequency (%)
0      170845  85.4%
3      22574   11.3%
4      4799    2.4%
5      1530    0.8%
6      219     0.1%

Value  Count  Frequency (%)
8      1      < 0.1%
7      32     < 0.1%
6      219    0.1%
5      1530   0.8%
4      4799   2.4%

straight_middle
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.19016
Minimum: 0
Maximum: 7
Zeros: 188280
Zeros (%): 94.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7750239852
Coefficient of variation (CV): 4.075641487
Kurtosis: 15.37875478
Mean: 0.19016
Median Absolute Deviation (MAD): 0
Skewness: 4.026287051
Sum: 38032
Variance: 0.6006621777
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count   Frequency (%)
0      188280  94.1%
3      9632    4.8%
4      1398    0.7%
5      608     0.3%
6      70      < 0.1%
7      12      < 0.1%

Value  Count   Frequency (%)
0      188280  94.1%
3      9632    4.8%
4      1398    0.7%
5      608     0.3%
6      70      < 0.1%

Value  Count  Frequency (%)
7      12     < 0.1%
6      70     < 0.1%
5      608    0.3%
4      1398   0.7%
3      9632   4.8%

fortune
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      182087
1      17913

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      182087  91.0%
1      17913   9.0%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      182087  91.0%
1      17913   9.0%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      182087  91.0%
1      17913   9.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      182087  91.0%
1      17913   9.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      182087  91.0%
1      17913   9.0%

wealth
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      181325
1      9369
2      9306

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 2
5th row: 0

Value  Count   Frequency (%)
0      181325  90.7%
1      9369    4.7%
2      9306    4.7%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      181325  90.7%
1      9369    4.7%
2      9306    4.7%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      181325  90.7%
1      9369    4.7%
2      9306    4.7%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      181325  90.7%
1      9369    4.7%
2      9306    4.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      181325  90.7%
1      9369    4.7%
2      9306    4.7%

land
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      192510
2      6061
1      1429

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      192510  96.3%
2      6061    3.0%
1      1429    0.7%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      192510  96.3%
2      6061    3.0%
1      1429    0.7%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      192510  96.3%
2      6061    3.0%
1      1429    0.7%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      192510  96.3%
2      6061    3.0%
1      1429    0.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      192510  96.3%
2      6061    3.0%
1      1429    0.7%

taxi
Real number (ℝ≥0)

ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.29921
Minimum: 0
Maximum: 8
Zeros: 174881
Zeros (%): 87.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9406422274
Coefficient of variation (CV): 3.14375264
Kurtosis: 13.73052065
Mean: 0.29921
Median Absolute Deviation (MAD): 0
Skewness: 3.662991601
Sum: 59842
Variance: 0.8848077999
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count   Frequency (%)
0      174881  87.4%
1      9596    4.8%
3      5433    2.7%
2      5300    2.6%
5      3534    1.8%
4      957     0.5%
6      260     0.1%
7      23      < 0.1%
8      16      < 0.1%

Value  Count   Frequency (%)
0      174881  87.4%
1      9596    4.8%
2      5300    2.6%
3      5433    2.7%
4      957     0.5%

Value  Count  Frequency (%)
8      16     < 0.1%
7      23     < 0.1%
6      260    0.1%
5      3534   1.8%
4      957    0.5%

birth_date
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      175690
6      22473
8      1837

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      175690  87.8%
6      22473   11.2%
8      1837    0.9%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      175690  87.8%
6      22473   11.2%
8      1837    0.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      175690  87.8%
6      22473   11.2%
8      1837    0.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      175690  87.8%
6      22473   11.2%
8      1837    0.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      175690  87.8%
6      22473   11.2%
8      1837    0.9%

mirror
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      187280
2      11256
3      1371
4      93

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      187280  93.6%
2      11256   5.6%
3      1371    0.7%
4      93      < 0.1%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      187280  93.6%
2      11256   5.6%
3      1371    0.7%
4      93      < 0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      187280  93.6%
2      11256   5.6%
3      1371    0.7%
4      93      < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      187280  93.6%
2      11256   5.6%
3      1371    0.7%
4      93      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      187280  93.6%
2      11256   5.6%
3      1371    0.7%
4      93      < 0.1%

legacy
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      170587
1      29413

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Value  Count   Frequency (%)
0      170587  85.3%
1      29413   14.7%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      170587  85.3%
1      29413   14.7%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      170587  85.3%
1      29413   14.7%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      170587  85.3%
1      29413   14.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      170587  85.3%
1      29413   14.7%

memorable
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      178877
1      10045
2      7210
3      3868

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      178877  89.4%
1      10045   5.0%
2      7210    3.6%
3      3868    1.9%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      178877  89.4%
1      10045   5.0%
2      7210    3.6%
3      3868    1.9%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      178877  89.4%
1      10045   5.0%
2      7210    3.6%
3      3868    1.9%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      178877  89.4%
1      10045   5.0%
2      7210    3.6%
3      3868    1.9%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      178877  89.4%
1      10045   5.0%
2      7210    3.6%
3      3868    1.9%

reserve
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.032885
Minimum: 0
Maximum: 8
Zeros: 197897
Zeros (%): 98.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3227446322
Coefficient of variation (CV): 9.814341863
Kurtosis: 107.1418722
Mean: 0.032885
Median Absolute Deviation (MAD): 0
Skewness: 10.10661549
Sum: 6577
Variance: 0.1041640976
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count   Frequency (%)
0      197897  98.9%
3      1911    1.0%
4      151     0.1%
6      17      < 0.1%
5      16      < 0.1%
7      6       < 0.1%
8      2       < 0.1%

Value  Count   Frequency (%)
0      197897  98.9%
3      1911    1.0%
4      151     0.1%
5      16      < 0.1%
6      17      < 0.1%

Value  Count  Frequency (%)
8      2      < 0.1%
7      6      < 0.1%
6      17     < 0.1%
5      16     < 0.1%
4      151    0.1%

reserve_middle
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12792
Minimum: 0
Maximum: 7
Zeros: 191713
Zeros (%): 95.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6195469254
Coefficient of variation (CV): 4.843237378
Kurtosis: 21.39643248
Mean: 0.12792
Median Absolute Deviation (MAD): 0
Skewness: 4.745225571
Sum: 25584
Variance: 0.3838383928
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count   Frequency (%)
0      191713  95.9%
3      7715    3.9%
4      463     0.2%
5      72      < 0.1%
6      32      < 0.1%
7      5       < 0.1%

Value  Count   Frequency (%)
0      191713  95.9%
3      7715    3.9%
4      463     0.2%
5      72      < 0.1%
6      32      < 0.1%

Value  Count  Frequency (%)
7      5      < 0.1%
6      32     < 0.1%
5      72     < 0.1%
4      463    0.2%
3      7715   3.9%

repeat
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value  Count
0      191248
1      6242
2      2510

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 200000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value  Count   Frequency (%)
0      191248  95.6%
1      6242    3.1%
2      2510    1.3%

Histogram of lengths of the category

Most occurring characters

Value  Count   Frequency (%)
0      191248  95.6%
1      6242    3.1%
2      2510    1.3%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  200000  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      191248  95.6%
1      6242    3.1%
2      2510    1.3%

Most occurring scripts

Value   Count   Frequency (%)
Common  200000  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      191248  95.6%
1      6242    3.1%
2      2510    1.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  200000  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      191248  95.6%
1      6242    3.1%
2      2510    1.3%

last_number
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.72672
Minimum: 0
Maximum: 9
Zeros: 11894
Zeros (%): 5.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.88038878
Coefficient of variation (CV): 0.502973566
Kurtosis: -0.9523011845
Mean: 5.72672
Median Absolute Deviation (MAD): 2
Skewness: -0.5458427771
Sum: 1145344
Variance: 8.296639525
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
9      42880  21.4%
8      34739  17.4%
6      28169  14.1%
7      14951  7.5%
5      14515  7.3%
4      14139  7.1%
3      13661  6.8%
2      12675  6.3%
1      12377  6.2%
0      11894  5.9%

Value  Count  Frequency (%)
0      11894  5.9%
1      12377  6.2%
2      12675  6.3%
3      13661  6.8%
4      14139  7.1%

Value  Count  Frequency (%)
9      42880  21.4%
8      34739  17.4%
7      14951  7.5%
6      28169  14.1%
5      14515  7.3%

Interactions

[90 interaction plots (Matplotlib SVG figures), not reproduced in this text version]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
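The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the code the report generator runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations; the shared 1/n factor
    # cancels in the ratio, so sample (n - 1) versions give the same r.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)


# Toy data (hypothetical, not taken from the profiled dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r_original = pearson_r(x, y)
# Shifting and rescaling y by a positive factor leaves r unchanged.
r_rescaled = pearson_r(x, [10.0 * v + 3.0 for v in y])
```

Comparing `r_original` with `r_rescaled` shows the location and scale invariance mentioned above.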

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
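As a rough sketch of that recipe, the snippet below replaces each value by its rank (ties receive the average of the ranks they span, which is an assumption about tie handling) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical and the data are made up.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (hypothetical data): y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)  # ranks agree exactly, so rho is 1 up to rounding
r = pearson_r(x, y)       # strictly below 1 because the relation is not linear
```

On this toy input ρ reaches 1 while r does not, which is the sense in which ρ captures nonlinear monotonic relationships.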

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
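The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, assuming that tied pairs count as neither concordant nor discordant; statistical libraries often apply a tie correction (τ-b) instead, so their output can differ when ties are present. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, by brute force in O(n^2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of the product tells whether x and y order the pair the same way.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y: counted in neither group.
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six gives (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

The brute-force loop is quadratic in the number of observations, so it is meant for illustration rather than for a 200000-row dataset.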

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
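For orientation, here is a sketch of the bias-corrected estimator as it is commonly stated: with χ² computed from an r × k contingency table of n observations, φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and floored at 0, and r and k are replaced by r - (r-1)²/(n-1) and k - (k-1)²/(n-1). The function name `cramers_v_corrected` and the example tables are assumptions, and the report's own implementation may differ in details.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V from a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0  # degenerate table (a single row or column)
    return math.sqrt(phi2_corr / denominator)


# Hypothetical 2 x 2 tables: independent counts and perfectly associated counts.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The two toy tables hit the ends of the range: the independent table gives 0 and the perfectly associated one gives 1 (up to floating-point rounding).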

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

Sample

First rows

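| | price_vnd | sim_number | provider | same_of_a_kind | straight | same_of_a_kind_middle | straight_middle | fortune | wealth | land | taxi | birth_date | mirror | legacy | memorable | reserve | reserve_middle | repeat | last_number |
| 0 | 450000 | 343189288 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 |
| 1 | 3000000 | 888899580 | vinaphone | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 500000 | 928960006 | vietnamobile | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| 3 | 5000000 | 902438679 | mobifone | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 9 |
| 4 | 450000 | 334307889 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
| 5 | 450000 | 328190680 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 12000000 | 926052005 | vietnamobile | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| 7 | 3000000 | 921220924 | vietnamobile | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
| 8 | 500000 | 877001866 | itelecom | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| 9 | 199000000 | 769889999 | mobifone | 4 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 9 |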

Last rows

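| | price_vnd | sim_number | provider | same_of_a_kind | straight | same_of_a_kind_middle | straight_middle | fortune | wealth | land | taxi | birth_date | mirror | legacy | memorable | reserve | reserve_middle | repeat | last_number |
| 199990 | 450000 | 336083085 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 5 |
| 199991 | 10300000 | 888883235 | vinaphone | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| 199992 | 500000 | 926290568 | vietnamobile | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 |
| 199993 | 1000000 | 945510776 | vinaphone | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 |
| 199994 | 3000000 | 977622004 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 1 | 0 | 0 | 0 | 0 | 4 |
| 199995 | 450000 | 866161769 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
| 199996 | 1000000 | 708124126 | mobifone | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 6 |
| 199997 | 1000000 | 904755200 | mobifone | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 199998 | 450000 | 329220204 | viettel | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
| 199999 | 1000000 | 904760321 | mobifone | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 3 | 0 | 0 | 1 |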